Abstract

This paper presents a new method for dynamic texture recognition based on spatiotemporal Gabor filters. Dynamic textures have emerged as a new field of investigation that extends the concept of self-similarity of texture images to the spatiotemporal domain. To model a dynamic texture, we convolve the sequence of images with a bank of spatiotemporal Gabor filters. For each response, a feature vector is built by calculating the energy statistic. As far as the authors know, this paper is the first to report an effective method for dynamic texture recognition using spatiotemporal Gabor filters. We evaluate the proposed method on two challenging databases, and the experimental results indicate that the proposed method is a robust approach for dynamic texture recognition.



1. Introduction 

The vision of animals provides a large amount of information that improves their perception of the world. This information is processed along different dimensions, including color, shape, illumination, and motion. While most of these features provide information about the static world, motion provides essential information for interaction with the external environment. In recent decades, the perception and interpretation of motion have attracted significant interest in the computer vision community [16], motivated by its importance to both the scientific and industrial communities. Despite significant advances, motion characterization is still an open problem.

For modeling image sequences, three classes of motion patterns have been suggested [13]: dynamic textures, activities, and events. The main difference between them lies in the temporal and spatial regularity of the motion field. In this work, we aim at modeling dynamic textures, also called temporal textures. They are essentially textures in motion, an extension of image texture to the spatiotemporal domain. Examples of dynamic textures include real-world scenes of fire, a flag blowing, sea waves, a moving escalator, boiling water, grass, and steam.

Existing methods for dynamic texture can be classified into four categories according to how they model the sequence of images. Because features based on motion (e.g., optical flow) can be estimated efficiently, the motion-based methods (i) are the most popular ones. These methods model dynamic textures based on a sequence of motion patterns. To model dynamic textures at different scales in space and time, the spatiotemporal filtering-based methods (ii) use spatiotemporal filters such as the wavelet transform. Model-based methods (iii) are generally based on linear dynamical systems, which provide a model that can be used in segmentation, synthesis, and classification applications [6][3][15]. Based on properties of moving contour surfaces, spatiotemporal geometric property-based methods (iv) extract motion and appearance features from the tangent plane distribution [10]. The reader may consult [4] for a review of dynamic texture methods.

In this paper, we propose a new approach for dynamic texture modeling based on spatiotemporal Gabor filters [12]. As far as the authors know, the present paper is the first one to model dynamic textures using spatiotemporal Gabor filters. These filters are built using two main parameters: the speed $v$ and the direction $\theta$. To model a dynamic texture, we convolve the sequence of images with a bank of spatiotemporal Gabor filters built with different values of speed and direction. For each response, a feature vector is built by calculating the energy statistic.

We evaluate the proposed method by classifying dynamic textures from two challenging databases: the dyntex database [11] and the traffic database [2]. Experimental results on both databases indicate that the proposed method is an effective approach for dynamic texture recognition. For the dyntex database, filters with low speeds (e.g., 0.1 pixels/frame) achieved better results than filters with high speeds. In fact, the dynamic textures in this database present slow motion patterns. On the other hand, for the traffic database, high speeds (e.g., 1.5 pixels/frame) achieved the best correct classification rate. In this database, vehicles move at a speed that matches the speed of the filter.

This paper is organized as follows. Section 2 briefly describes spatiotemporal Gabor filters. In Section 3, we present the proposed method for dynamic texture recognition based on spatiotemporal Gabor filters. An analysis of the proposed method with respect to the speed and direction parameters is presented in Section 4. Experimental results are given in Section 5, which is followed by the conclusion of this work in Section 6.

2. Spatiotemporal Gabor Filters 

Gabor filters are based on the important finding made by Hubel and Wiesel at the beginning of the 1960s. They found that the neurons of the primary visual cortex respond to lines or edges of a certain orientation in different positions of the visual field. Following this discovery, computational models were proposed for modeling the function of these neurons, and Gabor functions proved to be suited for this purpose in many works.

Initially, research aimed at studying the spatial properties of the receptive field. However, later studies revealed that the responses of cortical cells change in time and that some of them are inseparable functions of space and time. Therefore, these cells are essentially spatiotemporal filters that combine information over space and time, which makes them a suitable model for dynamic texture analysis.

In this work, the spatiotemporal receptive field is modeled by a family of 3D Gabor filters [12] described in Equation 1.
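The body of Equation 1 does not appear in this text. As a hedged point of reference, the parameters discussed in this section are consistent with the following commonly used form of a 3D Gabor function. This expression is an assumption made for illustration, not a transcription of the original equation; in particular, the unit step $U(t)$, the rotated coordinates $\bar{x}$ and $\bar{y}$, and the normalization constants are assumed.

```latex
g_{(v,\theta,\varphi)}(x, y, t) =
  \frac{\gamma}{2\pi\sigma^{2}}
  \exp\!\left( -\frac{(\bar{x} + v_c t)^{2} + \gamma^{2}\bar{y}^{2}}{2\sigma^{2}} \right)
  \cos\!\left( \frac{2\pi}{\lambda}(\bar{x} + v t) + \varphi \right)
  \frac{1}{\sqrt{2\pi}\,\tau}
  \exp\!\left( -\frac{(t - \mu_t)^{2}}{2\tau^{2}} \right) U(t),
\qquad
\bar{x} = x\cos\theta + y\sin\theta, \quad
\bar{y} = -x\sin\theta + y\cos\theta
```

Under this assumed form, the first exponential is the spatial Gaussian envelope (moving when $v_c = v$, stationary when $v_c = 0$), the cosine is the drifting carrier, and the last Gaussian models the temporal profile.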
We now discuss the parameters of the spatiotemporal Gabor filter. Some parameters were found empirically, based on studies of the response of the visual receptive field [12]. The size of the receptive field is determined by the standard deviation $\sigma$ of the Gaussian factor. The parameter $\gamma$ is the rate that specifies the ellipticity of the Gaussian envelope in the spatial domain. This parameter is set to $\gamma = 0.5$ to match the elongated receptive field along the $y$ axis. The speed $v$ is the phase speed of the cosine factor, which determines the speed of motion. The speed at which the center of the spatial Gaussian moves along the $x$ axis is specified by the parameter $v_c$. When $v_c = 0$, the center of the Gaussian envelope is stationary. On the other hand, a moving envelope is obtained when $v_c = v$. Figure 1 presents a moving envelope with $v_c = v = 1$.

Figure 1. Example of a spatiotemporal Gabor filter for $v_c = 1$.

The parameter $\lambda$ is the wavelength of the cosine factor. It is obtained through the relation $\lambda = \lambda_0 \sqrt{1 + v^2}$, where the constant $\lambda_0 = 2$. The angle $\theta \in [0, 2\pi]$ determines the direction of motion and the spatial orientation of the filter. The phase offset $\varphi \in [-\pi, \pi]$ determines the symmetry in the spatial domain. A Gaussian distribution with mean $\mu_t$ and standard deviation $\tau$ is used to model the change in intensities over time. The mean $\mu_t = 1.75$ and the standard deviation $\tau = 2.75$ are fixed based on the mean duration of most receptive fields.
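As a rough illustration of how these parameters interact, the following sketch samples a 3D kernel on a discrete grid. It assumes a commonly used functional form (spatial Gaussian envelope times cosine carrier times causal temporal Gaussian), which may differ from the exact filter used here. The function name `spatiotemporal_gabor`, the grid sizes `half_size` and `n_frames`, and the default `sigma` are hypothetical choices; only $\gamma$, $\lambda_0$, $\mu_t$, $\tau$, and the relation for $\lambda$ come from the text.

```python
import numpy as np

def spatiotemporal_gabor(v, theta, phi=0.0, sigma=2.0, gamma=0.5,
                         lambda0=2.0, mu_t=1.75, tau=2.75,
                         moving_envelope=True, half_size=8, n_frames=8):
    """Sample an assumed 3D Gabor kernel on a (t, y, x) grid."""
    lam = lambda0 * np.sqrt(1.0 + v ** 2)        # wavelength relation from the text
    v_c = v if moving_envelope else 0.0          # envelope speed: v (moving) or 0
    t = np.arange(n_frames, dtype=float)         # only t >= 0 is sampled (causal)
    coords = np.arange(-half_size, half_size + 1, dtype=float)
    tt, yy, xx = np.meshgrid(t, coords, coords, indexing="ij")

    # Rotate the spatial coordinates by the preferred direction theta.
    x_rot = xx * np.cos(theta) + yy * np.sin(theta)
    y_rot = -xx * np.sin(theta) + yy * np.cos(theta)

    # Elliptical Gaussian envelope, drifting cosine carrier, temporal Gaussian.
    spatial = (gamma / (2 * np.pi * sigma ** 2)) * np.exp(
        -((x_rot + v_c * tt) ** 2 + (gamma * y_rot) ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi / lam * (x_rot + v * tt) + phi)
    temporal = np.exp(-((tt - mu_t) ** 2) / (2 * tau ** 2)) / (np.sqrt(2 * np.pi) * tau)
    return spatial * carrier * temporal

# Example: a filter tuned to speed 1 pixel/frame and direction 0.
kernel = spatiotemporal_gabor(v=1.0, theta=0.0)
```

The kernel is indexed as (frame, row, column), so it can be applied directly to a video stored in the same layout.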

3. Dynamic Textures Modeling based on Spatiotemporal Gabor Filters

In this section, we describe the proposed method for dynamic texture modeling based on spatiotemporal Gabor filters. Briefly, the sequence of images is convolved with a bank of spatiotemporal Gabor filters and a feature vector is constructed with the energy of the responses as components.

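The response $r_{(v,\theta,\varphi)}(x, y, t)$ of a spatiotemporal Gabor filter $g_{(v,\theta,\varphi)}$ to a sequence of images $I(x, y, t)$ is computed by convolution:

$$r_{(v,\theta,\varphi)}(x, y, t) = I(x, y, t) * g_{(v,\theta,\varphi)}(x, y, t) \tag{2}$$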



Spatiotemporal Gabor filters are phase sensitive because their response to a moving pattern depends on the exact position of the pattern within the sequence of images. To overcome this drawback, a phase-insensitive response can be obtained by combining the squared responses of two filters with different phase offsets:

$$R_{(v,\theta)}(x, y, t) = \sqrt{r^2_{(v,\theta,\varphi)}(x, y, t) + r^2_{(v,\theta,\varphi - \pi/2)}(x, y, t)} \tag{3}$$

To characterize the Gabor space resulting from the convolution, the energy of the response is computed according to Equation 4:

$$E_{(v,\theta)} = \sum_{x, y, t} R_{(v,\theta)}(x, y, t) \tag{4}$$
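The effect of combining two phase offsets can be checked numerically on a one-dimensional toy case. The sketch below assumes that the two filters form a quadrature pair (phase offsets 0 and $-\pi/2$); the values of `sigma` and `wavelength` are hypothetical and chosen only so that the envelope spans several cycles. It compares the response of a single filter with the combined response while the phase of a sinusoidal stimulus is varied.

```python
import numpy as np

x = np.arange(-20, 21, dtype=float)
sigma, wavelength = 4.0, 8.0                        # hypothetical illustration values
omega = 2 * np.pi / wavelength
envelope = np.exp(-x ** 2 / (2 * sigma ** 2))
g_even = envelope * np.cos(omega * x)               # phase offset 0
g_odd = envelope * np.cos(omega * x - np.pi / 2)    # phase offset -pi/2

def responses(stimulus_phase):
    """Responses of both filters and their combination at the filter center."""
    stimulus = np.cos(omega * x + stimulus_phase)
    r_even = float(np.dot(g_even, stimulus))
    r_odd = float(np.dot(g_odd, stimulus))
    return r_even, r_odd, float(np.hypot(r_even, r_odd))

phases = np.linspace(0, 2 * np.pi, 16, endpoint=False)
table = np.array([responses(p) for p in phases])    # columns: even, odd, combined

print("single filter range:", table[:, 0].min(), table[:, 0].max())
print("combined range:     ", table[:, 2].min(), table[:, 2].max())
```

In this toy setting the single-filter response swings between positive and negative values as the stimulus phase changes, whereas the combined response stays nearly constant.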

A central issue in applying spatiotemporal Gabor filters is determining filter parameters that cover the spatiotemporal frequency space and capture as much dynamic texture information as possible. Each spatiotemporal Gabor filter is determined by two main parameters: the direction $\theta$ and the speed of motion $v$. In order to cover a wide range of dynamic textures, we design a bank of spatiotemporal Gabor filters using a set of values for speed $V = [v_1, v_2, \ldots, v_n]$ and direction $\theta = [\theta_1, \theta_2, \ldots, \theta_m]$. The feature vector that characterizes the dynamic texture is composed of the energy of the response for each combination of speed in $V$ and direction in $\theta$ (Equation 5).
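The size of the resulting feature vector follows directly from the parameter grid: $n$ speeds and $m$ directions give $n \times m$ energies. A minimal sketch of the grid construction is shown below; the helper names `speed_range`, `directions`, and `filter_bank_parameters` are hypothetical, and evenly spaced directions over $[0, 2\pi)$ are an assumption.

```python
import math
from itertools import product

def speed_range(start, stop, step):
    """Inclusive list of speeds from start to stop with a fixed step."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(n)]

def directions(m):
    """m evenly spaced directions in [0, 2*pi) (assumed spacing)."""
    return [2 * math.pi * k / m for k in range(m)]

def filter_bank_parameters(speeds, thetas):
    """One (speed, direction) pair per filter in the bank."""
    return list(product(speeds, thetas))

speeds = speed_range(0.5, 2.0, 0.25)
bank = filter_bank_parameters(speeds, directions(8))
print(len(speeds), "speeds x 8 directions =", len(bank), "features")
```

With the seven speeds from 0.5 to 2.0 in steps of 0.25, an 8-direction bank yields 56 energies per video and a 4-direction bank yields 28.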

The proposed method is summarized in Figure 2. First, we design a bank of spatiotemporal Gabor filters composed of filters with different directions and speeds. Then, the sequence of images is convolved with the bank of filters. For each convolved sequence of images, we calculate the energy to compose a feature vector.
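The three steps can be sketched in a few lines of NumPy. This is an illustrative outline under stated assumptions, not the authors' implementation: the kernels are passed in as precomputed pairs with two phase offsets, the convolution is a full linear convolution computed with zero-padded FFTs (so boundary regions contribute to the energy), and the names `convolve3d_full` and `energy_features` are hypothetical.

```python
import numpy as np

def convolve3d_full(video, kernel):
    """Full linear 3D convolution computed with zero-padded FFTs."""
    shape = [s + k - 1 for s, k in zip(video.shape, kernel.shape)]
    axes = (0, 1, 2)
    spectrum = (np.fft.rfftn(video, s=shape, axes=axes)
                * np.fft.rfftn(kernel, s=shape, axes=axes))
    return np.fft.irfftn(spectrum, s=shape, axes=axes)

def energy_features(video, filter_pairs):
    """One energy value per pair of kernels with two phase offsets."""
    features = []
    for g_a, g_b in filter_pairs:
        r_a = convolve3d_full(video, g_a)
        r_b = convolve3d_full(video, g_b)
        phase_insensitive = np.sqrt(r_a ** 2 + r_b ** 2)   # combined response
        features.append(phase_insensitive.sum())           # energy of the response
    return np.array(features)

# Toy usage: random data stands in for a real video and real Gabor kernels.
rng = np.random.default_rng(0)
video = rng.random((12, 16, 16))                            # (frames, rows, columns)
pairs = [(rng.normal(size=(4, 5, 5)), rng.normal(size=(4, 5, 5)))
         for _ in range(3)]
features = energy_features(video, pairs)                    # one value per pair
```

The resulting vector has one component per (speed, direction) combination in the bank and can be fed directly to a classifier.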

4. Response Analysis of Spatiotemporal Gabor Filters

Here, we analyze the speed and direction properties of the spatiotemporal Gabor filters on synthetic sequences of images. In Figure 3, we present the response of spatiotemporal Gabor filters to bars moving at the same speed but in different directions $\theta$. The filters and the moving bars share the same speed $v = 1$. The response has the highest magnitude when the direction of the filter matches the direction of the moving bar. For instance, when $\theta = 0$, a vertical bar moving rightwards evokes a higher response than bars with other directions of movement.
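Synthetic stimuli of this kind are straightforward to generate. The sketch below builds a binary sequence containing a bar that translates along a chosen direction at a chosen speed; the function name `moving_bar` and the parameters `size`, `n_frames`, and `width` are hypothetical and are not taken from the experiments reported here.

```python
import numpy as np

def moving_bar(size=32, n_frames=16, theta=0.0, v=1.0, width=2.0):
    """Binary (t, y, x) sequence of a bar translating along direction theta."""
    coords = np.arange(size, dtype=float) - size / 2
    t = np.arange(n_frames, dtype=float)
    tt, yy, xx = np.meshgrid(t, coords, coords, indexing="ij")
    # Position of each pixel along the motion direction; the bar itself is
    # perpendicular to that direction.
    along = xx * np.cos(theta) + yy * np.sin(theta)
    center = v * (tt - n_frames / 2)             # bar center moves v pixels/frame
    return (np.abs(along - center) <= width / 2).astype(float)

# A vertical bar moving rightwards at 1 pixel/frame (theta = 0, v = 1).
sequence = moving_bar(theta=0.0, v=1.0)
```

Varying `theta` while keeping `v` fixed produces stimuli for a direction analysis, and varying `v` while keeping `theta` fixed produces stimuli for a speed analysis.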

The speed property is evaluated in Figure 4. We analyze the response of spatiotemporal Gabor filters to edges drifting rightward at different speeds. The filters and the synthetic sequences of images have the same preferred direction $\theta = 0$. The highest response is achieved by filters whose speed matches the speed of the moving edge.

In Figure 5(a), we plot the response of filters to a moving bar with direction $\theta = 0$ at a speed $v = 1$. The response reaches its maximum value for a filter with direction $\theta = 0$, which matches the direction of the moving bar. As we can see, the filter with a moving envelope ($v_c = v$) achieved a higher response than the filter with a stationary envelope ($v_c = 0$). The plot for the speed parameter is shown in Figure 5(b). We convolved filters with an edge drifting rightward in direction $\theta = 0$ at a speed of $v = 2$. The maximum response is achieved by the filter whose speed matches the speed of the stimulus. Again, we can conclude that filters with a moving envelope are more selective for both direction and speed than filters with a stationary envelope.

Figure 2. The proposed method consists of the following steps: (i) design a bank of spatiotemporal Gabor filters using different values of speed $v$ and direction $\theta$; (ii) convolve the sequence of images with the bank of filters; (iii) calculate the energy of each response to compose the feature vector.

5. Experimental Results 

In this section, we present the experimental results using two databases: (i) the dyntex database and (ii) the traffic video database. The dyntex database consists of 50 dynamic texture classes, each containing 10 samples collected from the Dyntex database [11]. The videos are at least 250 frames long with dimensions of 400 x 300 pixels. Figure 6 shows examples of dynamic textures from the first database. The second database, collected from the traffic database [2], consists of 254 videos divided into three classes: light, medium, and heavy traffic. The videos have 42 to 52 frames with a resolution of 320 x 240 pixels. The variety of traffic patterns and weather conditions is shown in Figure 7. All the experiments used a k-nearest neighbor classifier with k = 1 in a 10-fold cross-validation scheme.

Figure 3. Response of spatiotemporal Gabor filters to bars moving in different directions $\theta$. The first row corresponds to the filters. The second, third, and fourth rows correspond to the responses to bars moving in three different directions, the first being $\theta = 0$.

Figure 4. Response of spatiotemporal Gabor filters to edges moving at different speeds $v$. The first, second, and third rows correspond to the responses to edges moving at speeds $v = 1, 2, 4$, respectively.

Figure 5. (a) Response of filters with different directions $\theta$ to a moving bar with $\theta = 0$. (b) Response of filters with different speeds $v$ to an edge moving at speed $v = 2$.
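The evaluation protocol can be sketched as follows. The Euclidean distance, the random unstratified fold assignment, and the function name `one_nn_cross_validation` are assumptions made for illustration; the text only specifies a nearest neighbor classifier with k = 1 and 10 folds.

```python
import numpy as np

def one_nn_cross_validation(features, labels, n_folds=10, seed=0):
    """Mean and standard deviation of 1-NN accuracy over n_folds folds."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    order = np.random.default_rng(seed).permutation(len(labels))
    folds = np.array_split(order, n_folds)
    accuracies = []
    for test_idx in folds:
        train_idx = np.setdiff1d(order, test_idx)
        # Euclidean distance between every test and every training sample.
        diff = features[test_idx][:, None, :] - features[train_idx][None, :, :]
        dist = np.linalg.norm(diff, axis=2)
        predicted = labels[train_idx][dist.argmin(axis=1)]
        accuracies.append(np.mean(predicted == labels[test_idx]))
    return float(np.mean(accuracies)), float(np.std(accuracies))

# Toy usage: two well-separated synthetic classes standing in for energy features.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, size=(20, 3)),
               rng.normal(5.0, 0.1, size=(20, 3))])
y = np.array([0] * 20 + [1] * 20)
mean_acc, std_acc = one_nn_cross_validation(X, y)
```

The function returns the mean accuracy and its standard deviation across folds, which is the pair of numbers reported in each cell of the result tables.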

Now, we discuss the influence of the direction and speed parameters on dynamic texture recognition. Table 1 shows the average and standard deviation of the correct classification rate for the traffic database. Columns present the evaluation of the direction parameter using 4-directions, a combination of filters with $\theta = [0, \pi/2, \pi, 3\pi/2]$, and 8-directions, a combination of filters with $\theta = [0, \pi/4, \pi/2, 3\pi/4, \pi, 5\pi/4, 3\pi/2, 7\pi/4]$. Rows present the speed evaluation using combinations of speeds with a step of 0.25. As we can see, the bank composed of filters with 8 directions outperformed the bank with 4 directions for all combinations of speeds. However, the improvement is small: on average, the correct classification rate increased by less than 1% when the number of directions rises from 4 to 8. This is because the cars on the highway always move in the same direction, which is already captured by 4 directions. With respect to the speed parameter, the best results were achieved with high speeds, such as 1.5 pixels/frame and 2.0 pixels/frame. A correct classification rate of 91.50% was achieved by a bank of filters composed of 8 directions and speeds [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]. The high speeds match the speed of the cars in the sequence of images, so the traffic jam can be modeled using these parameters.

Figure 6. Examples of dynamic textures from the dyntex database [11]. The database is composed of 50 dynamic texture classes, each containing 10 samples.

Figure 7. Examples of dynamic textures from the traffic database [2]. The database consists of 254 videos split into three classes: light, medium, and heavy traffic.

Speed Combination    4-directions    8-directions
[0.50 - …]           90.06 (5.53)    90.18 (5.62)
[0.50 - 1.75]        89.69 (6.18)    90.56 (5.58)
[0.50 - 2.00]        89.45 (5.77)    91.50 (5.20)
[0.50 - 2.25]        90.24 (5.51)    90.87 (5.07)
[0.50 - 2.50]        89.33 (5.60)    90.75 (5.26)
[0.50 - 2.75]        89.76 (5.44)    90.35 (5.44)
[0.50 - 3.00]        89.60 (5.16)    90.63 (5.21)

Table 1. Correct classification rate (%) and standard deviation (in parentheses) for different combinations of speed and direction on the traffic database.

Speed Combination    4-directions    8-directions
[0.1 - 0.2]          92.50 (3.40)    94.92 (3.09)
[0.1 - 0.5]          96.00 (2.40)    96.82 (2.41)
[0.1 - 1.0]          96.56 (2.11)    98.02 (1.65)
[0.1 - 1.5]          96.92 (2.32)    98.60 (1.60)
[0.1 - 2.0]          97.24 (2.18)    97.84 (1.92)
[0.1 - 2.5]          96.92 (2.49)    97.34 (2.36)
[0.1 - 3.0]          96.37 (2.98)    96.94 (2.73)

Table 2. Correct classification rate (%) and standard deviation (in parentheses) for different combinations of speed and direction on the dyntex database.
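The average gain obtained by moving from 4 to 8 directions can be checked with simple arithmetic on the classification rates reported in Tables 1 and 2. The sketch below only restates the tabulated values, listed in the row order of the tables; the helper name `mean_gain` is hypothetical.

```python
# (4-directions, 8-directions) correct classification rates, in table row order.
table1_traffic = [
    (90.06, 90.18), (89.69, 90.56), (89.45, 91.50), (90.24, 90.87),
    (89.33, 90.75), (89.76, 90.35), (89.60, 90.63),
]
table2_dyntex = [
    (92.50, 94.92), (96.00, 96.82), (96.56, 98.02), (96.92, 98.60),
    (97.24, 97.84), (96.92, 97.34), (96.37, 96.94),
]

def mean_gain(rows):
    """Average difference (8-directions minus 4-directions) in percentage points."""
    gains = [eight - four for four, eight in rows]
    return sum(gains) / len(gains)

print(f"traffic: {mean_gain(table1_traffic):.2f} points")
print(f"dyntex:  {mean_gain(table2_dyntex):.2f} points")
```

For the traffic database the mean gain is below one percentage point, in line with the statement above, while for the dyntex database it is slightly above one point.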

In Table 2, we present the experimental results obtained on the dyntex database. The same combinations of directions as in the previous experiment were used to evaluate the proposed method. However, as the dynamic textures in this database present low speeds, the combinations of speeds started at $v = 0.1$ pixels/frame with a step of 0.1. As in the previous results, the 8-direction bank of filters achieved higher correct classification rates than the 4-direction bank of filters. In this case, a correct classification rate of 98.60% was obtained, which shows the effectiveness of the proposed method for dynamic texture recognition.

6. Conclusion 

In this paper, we proposed a new method for dynamic texture recognition based on spatiotemporal Gabor filters. First, it convolves a sequence of images with a bank of filters and then extracts the energy statistic from each response. The spatiotemporal Gabor filters are built using the speed $v$ and the direction $\theta$. The bank of filters is composed of filters with different values of speed and direction.

Promising results considering different combinations of $v$ and $\theta$ were achieved on two important databases: the dyntex and traffic databases. On the traffic database, our method achieved a correct classification rate of 91.50% using a combination of high speeds. On the other hand, a correct classification rate of 98.60% was obtained on the dyntex database using a combination of low speeds.